census tract
Urban Incident Prediction with Graph Neural Networks: Integrating Government Ratings and Crowdsourced Reports
Balachandar, Sidhika, Sadhuka, Shuvom, Berger, Bonnie, Pierson, Emma, Garg, Nikhil
Graph neural networks (GNNs) are widely used in urban spatiotemporal forecasting, such as predicting infrastructure problems. In this setting, government officials wish to know in which neighborhoods incidents like potholes or rodent issues occur. The true state of incidents (e.g., street conditions) for each neighborhood is observed via government inspection ratings. However, these ratings are only conducted for a sparse set of neighborhoods and incident types. We also observe the state of incidents via crowdsourced reports, which are more densely observed but may be biased due to heterogeneous reporting behavior. First, for such settings, we propose a multiview, multioutput GNN-based model that uses both unbiased rating data and biased reporting data to predict the true latent state of incidents. Second, we investigate a case study of New York City urban incidents and collect, standardize, and make publicly available a dataset of 9,615,863 crowdsourced reports and 1,041,415 government inspection ratings over 3 years and across 139 types of incidents. Finally, we show on both real and semi-synthetic data that our model can better predict the latent state compared to models that use only reporting data or models that use only rating data, especially when rating data is sparse and reports are predictive of ratings. We also quantify demographic biases in crowdsourced reporting, e.g., higher-income neighborhoods report problems at higher rates. Our analysis showcases a widely applicable approach for latent state prediction using heterogeneous, sparse, and biased data.
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > New York > Bronx County > New York City (0.04)
- Asia > Bangladesh (0.04)
- Africa > Comoros > Grande Comore > Moroni (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Transportation > Ground > Road (0.67)
- Transportation > Infrastructure & Services (0.67)
- Government > Regional Government > North America Government > United States Government (0.47)
- (2 more...)
- Information Technology > Communications > Social Media > Crowdsourcing (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Modeling Urban Food Insecurity with Google Street View Images
Food insecurity is a significant social and public health issue that plagues many urban metropolitan areas around the world. Existing approaches to identifying food insecurity rely primarily on qualitative and quantitative survey data, which is difficult to scale. This project seeks to explore the effectiveness of using street-level images in modeling food insecurity at the census tract level. To do so, we propose a two-step process of feature extraction and gated attention for image aggregation. We evaluate the effectiveness of our model by comparing against other model architectures, interpreting our learned weights, and performing a case study. While our model falls slightly short in terms of its predictive power, we believe our approach still has the potential to supplement existing methods of identifying food insecurity for urban planners and policymakers.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom (0.04)
- Africa > Kenya > Nairobi City County > Nairobi (0.04)
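The street view entry above mentions gated attention for aggregating images within a census tract. The sketch below shows one common formulation of gated attention pooling, in which each image feature vector gets a score from a tanh branch multiplied by a sigmoid gate, and the scores are normalized with a softmax. Whether the paper uses exactly this form is an assumption; the function name, the matrices `V` and `U`, the vector `w`, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def gated_attention_pool(features, V, U, w):
    """Aggregate per-image feature vectors into one tract-level vector.

    score_k = w . (tanh(V h_k) * sigmoid(U h_k)), weights = softmax(scores).
    Returns the weighted average of the features and the attention weights.
    """
    scores = []
    for h in features:
        tanh_branch = [math.tanh(x) for x in matvec(V, h)]
        gate_branch = [1.0 / (1.0 + math.exp(-x)) for x in matvec(U, h)]
        scores.append(sum(wi * t * g for wi, t, g in zip(w, tanh_branch, gate_branch)))

    # Numerically stable softmax over the images in the tract.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(features[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, features)) for d in range(dim)]
    return pooled, weights


# Toy example: three 2-dimensional image features for one tract (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical learned parameters
U = [[0.5, 0.5], [0.5, -0.5]]  # hypothetical learned parameters
w = [1.0, 1.0]                 # hypothetical learned parameters
pooled, weights = gated_attention_pool(features, V, U, w)
```

The attention weights sum to one, so they can be read as the relative contribution of each image to the tract-level representation, which is one way learned weights of this kind can be interpreted.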
Commute Networks as a Signature of Urban Socioeconomic Performance: Evaluating Mobility Structures with Deep Learning Models
Khulbe, Devashish, Belyi, Alexander, Sobolevsky, Stanislav
Urban socioeconomic modeling has predominantly concentrated on extensive location and neighborhood-based features, relying on the localized population footprint. However, networks in urban systems are common, and many urban modeling methods don't account for network-based effects. In this study, we propose using commute information records from the census as a reliable and comprehensive source to construct mobility networks across cities. Leveraging deep learning architectures, we employ these commute networks across U.S. metro areas for socioeconomic modeling. We show that mobility network structures provide significant predictive performance without considering any node features. Consequently, we use mobility networks to present a supervised learning framework to model a city's socioeconomic indicator directly, combining Graph Neural Network and Vanilla Neural Network models to learn all parameters in a single learning pipeline. Our experiments in 12 major U.S. cities show the proposed model outperforms previous conventional machine learning models. This work provides urban researchers methods to incorporate network effects in urban modeling and informs stakeholders of wider network-based effects in urban policymaking and planning.
- North America > United States > Illinois > Cook County > Chicago (0.05)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- (6 more...)
- Banking & Finance > Economy (0.68)
- Transportation > Ground > Road (0.68)
- Transportation > Infrastructure & Services (0.46)
Bayesian Modeling of Zero-Shot Classifications for Urban Flood Detection
Franchi, Matt, Garg, Nikhil, Ju, Wendy, Pierson, Emma
Street scene datasets, collected from Street View or dashboard cameras, offer a promising means of detecting urban objects and incidents like street flooding. However, a major challenge in using these datasets is their lack of reliable labels: there are myriad types of incidents, many types occur rarely, and ground-truth measures of where incidents occur are lacking. Here, we propose BayFlood, a two-stage approach which circumvents this difficulty. First, we perform zero-shot classification of where incidents occur using a pretrained vision-language model (VLM). Second, we fit a spatial Bayesian model on the VLM classifications. The zero-shot approach avoids the need to annotate large training sets, and the Bayesian model provides frequent desiderata in urban settings: principled measures of uncertainty, smoothing across locations, and incorporation of external data like stormwater accumulation zones. We comprehensively validate this two-stage approach, showing that VLMs provide strong zero-shot signal for floods across multiple cities and time periods, the Bayesian model improves out-of-sample prediction relative to baseline methods, and our inferred flood risk correlates with known external predictors of risk. Having validated our approach, we show it can be used to improve urban flood detection: our analysis reveals 113,738 people who are at high risk of flooding overlooked by current methods, identifies demographic biases in existing methods, and suggests locations for new flood sensors. More broadly, our results showcase how Bayesian modeling of zero-shot LM annotations represents a promising paradigm because it avoids the need to collect large labeled datasets and leverages the power of foundation models while providing the expressiveness and uncertainty quantification of Bayesian models.
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > Texas > Galveston County > Galveston (0.14)
- North America > United States > California > Alameda County > Berkeley (0.14)
- (9 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.87)
- Health & Medicine (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- Transportation > Ground (0.93)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Causal Discovery and Inference towards Urban Elements and Associated Factors
Feng, Tao, Zhang, Yunke, Fan, Xiaochen, Wang, Huandong, Li, Yong
To uncover the city's fundamental functioning mechanisms, it is important to acquire a deep understanding of complicated relationships among citizens, location, and mobility behaviors. Previous research studies have applied direct correlation analysis to investigate such relationships. Nevertheless, due to the ubiquitous confounding effects, empirical correlation analysis may not accurately reflect underlying causal relationships among basic urban elements. In this paper, we propose a novel urban causal computing framework to comprehensively explore causalities and confounding effects among a variety of factors across different types of urban elements. In particular, we design a reinforcement learning algorithm to discover the potential causal graph, which depicts the causal relations between urban factors. The causal graph further serves as the guidance for estimating causal effects between pair-wise urban factors by propensity score matching. After removing the confounding effects from correlations, we leverage significance levels of causal effects in downstream urban mobility prediction tasks. Experimental studies on open-source urban datasets show that the discovered causal graph demonstrates a hierarchical structure, where citizens affect locations, and they both cause changes in urban mobility behaviors. Experimental results in urban mobility prediction tasks further show that the proposed method can effectively reduce confounding effects and enhance performance of urban computing tasks.
- Asia > China (0.47)
- North America > United States > New York (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Africa (0.14)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Transportation > Infrastructure & Services (0.69)
- Government (0.68)
- Health & Medicine > Therapeutic Area (0.46)
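The causal discovery entry above estimates causal effects between pairs of urban factors by propensity score matching. As a rough illustration of the general technique only, the sketch below does one-to-one nearest-neighbor matching with replacement on precomputed propensity scores and averages the outcome differences over treated units. It is not the paper's pipeline: the function name and the toy values are hypothetical, and the propensity scores are assumed to come from some separately fitted model.

```python
def att_by_matching(treated, propensity, outcome):
    """Estimate the average treatment effect on the treated (ATT).

    Each treated unit is matched to the control unit with the closest
    propensity score (matching with replacement), and the outcome
    differences are averaged over the treated units.
    """
    treated_idx = [i for i, t in enumerate(treated) if t]
    control_idx = [i for i, t in enumerate(treated) if not t]
    if not treated_idx or not control_idx:
        raise ValueError("need at least one treated and one control unit")

    diffs = []
    for i in treated_idx:
        # Nearest control unit in propensity-score distance.
        j = min(control_idx, key=lambda c: abs(propensity[c] - propensity[i]))
        diffs.append(outcome[i] - outcome[j])
    return sum(diffs) / len(diffs)


# Toy data (made-up values): two treated units and three control units.
treated = [1, 1, 0, 0, 0]
propensity = [0.80, 0.30, 0.75, 0.35, 0.50]
outcome = [10.0, 6.0, 7.0, 5.0, 6.0]
att = att_by_matching(treated, propensity, outcome)
```

Matching on the propensity score compares units that were similarly likely to be treated given the confounders, which is why it is used to separate a causal effect from a raw correlation.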
Analyzing Geospatial and Socioeconomic Disparities in Breast Cancer Screening Among Populations in the United States: Machine Learning Approach
Hashtarkhani, Soheil, Zhou, Yiwang, Kumsa, Fekede Asefa, White-Means, Shelley, Schwartz, David L, Shaban-Nejad, Arash
Breast cancer screening plays a pivotal role in early detection and subsequent effective management of the disease, impacting patient outcomes and survival rates. This study aims to assess breast cancer screening rates nationwide in the United States and investigate the impact of social determinants of health on these screening rates. Data on mammography screening at the census tract level for 2018 and 2020 were collected from the Behavioral Risk Factor Surveillance System. We developed a large dataset of social determinants of health, comprising 13 variables for 72,337 census tracts. Spatial analysis employing Getis-Ord Gi statistics was used to identify clusters of high and low breast cancer screening rates. To evaluate the influence of these social determinants, we implemented a random forest model and compared its performance with that of linear regression and support vector machine models. The models were evaluated using R² and root mean squared error metrics. Shapley Additive Explanations values were subsequently used to assess the significance of variables and the direction of their influence. Geospatial analysis revealed elevated screening rates in the eastern and northern United States, while central and midwestern regions exhibited lower rates. The random forest model outperformed the linear regression and support vector machine models, with an R² of 64.53% and a root mean squared error of 2.06. Shapley Additive Explanations values indicated that the percentage of the Black population, the number of mammography facilities within a 10-mile radius, and the percentage of the population with at least a bachelor's degree were the most influential variables, all positively associated with mammography screening rates.
- North America > United States > Tennessee > Shelby County > Memphis (0.05)
- North America > United States > Texas (0.04)
- North America > United States > New York (0.04)
- (9 more...)
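The breast cancer screening entry above compares models using R² and root mean squared error. For readers less familiar with these metrics, here is a minimal sketch of how both are computed from observed and predicted values. The function names and the toy screening rates are made up for illustration and are not data from the study.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Toy tract-level screening rates in percent (made-up values).
observed = [70.0, 72.0, 75.0, 78.0, 80.0]
predicted = [71.0, 71.0, 76.0, 77.0, 80.0]

r2_value = r_squared(observed, predicted)
rmse_value = rmse(observed, predicted)
```

R² is a unitless fraction of variance explained (often reported multiplied by 100), while the root mean squared error is in the units of the target, here percentage points of screening rate.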
Improving the Fairness of Deep-Learning, Short-term Crime Prediction with Under-reporting-aware Models
Wu, Jiahui, Frias-Martinez, Vanessa
Deep learning crime prediction tools use past crime data and additional behavioral datasets to forecast future crimes. Nevertheless, these tools have been shown to suffer from unfair predictions across minority racial and ethnic groups. Current approaches to address this unfairness generally propose either pre-processing methods that mitigate the bias in the training datasets by applying corrections to crime counts based on domain knowledge or in-processing methods that are implemented as fairness regularizers to optimize for both accuracy and fairness. In this paper, we propose a novel deep learning architecture that combines the power of these two approaches to increase prediction fairness. Our results show that the proposed model improves the fairness of crime predictions when compared to models with in-processing de-biasing approaches and to models without any type of bias correction, albeit at the cost of reducing accuracy.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- (6 more...)
Network-Based Transfer Learning Helps Improve Short-Term Crime Prediction Accuracy
Wu, Jiahui, Frias-Martinez, Vanessa
Deep learning architectures enhanced with human mobility data have been shown to improve the accuracy of short-term crime prediction models trained with historical crime data. However, human mobility data may be scarce in some regions, negatively impacting the correct training of these models. To address this issue, we propose a novel transfer learning framework for short-term crime prediction models, whereby weights from the deep learning crime prediction models trained in source regions with plenty of mobility data are transferred to target regions to fine-tune their local crime prediction models and improve crime prediction accuracy. Our results show that the proposed transfer learning framework improves the F1 scores for target cities with mobility data scarcity, especially when the number of months of available mobility data is small. We also show that the F1 score improvements are pervasive across different types of crimes and diverse cities in the US.
- North America > United States > Illinois > Cook County > Chicago (0.06)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.06)
- North America > United States > New York > New York County > New York City (0.05)
- (3 more...)
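The transfer learning entry above reports its results as F1 scores. As a reminder of what that metric measures, the sketch below computes the binary F1 score (the harmonic mean of precision and recall) from true and predicted labels. The function name and the toy labels are hypothetical, and treating crime prediction as a binary "crime / no crime" label per cell is an assumption made here for illustration.

```python
def binary_f1(y_true, y_pred):
    """F1 score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (made-up values): 1 = at least one crime in the cell, 0 = none.
actual = [1, 0, 1, 1, 0, 0]
forecast = [1, 0, 0, 1, 1, 0]
f1 = binary_f1(actual, forecast)
```

Because F1 ignores true negatives, it is a natural choice when positive events are rare, as is typical for short-term crime counts in small spatial units.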
Advancing Transportation Mode Share Analysis with Built Environment: Deep Hybrid Models with Urban Road Network
Zhuang, Dingyi, Wang, Qingyi, Zheng, Yunhan, Guo, Xiaotong, Wang, Shenhao, Koutsopoulos, Haris N, Zhao, Jinhua
Transportation mode share analysis is important to various real-world transportation tasks as it helps researchers understand the travel behaviors and choices of passengers. A typical example is the prediction of communities' travel mode share by accounting for their sociodemographics (e.g., age and income) and travel modes' attributes (e.g., travel cost and time). However, there have been only limited efforts to integrate the structure of the urban built environment, e.g., road networks, into mode share models to capture the impacts of the built environment. This task usually requires manual feature engineering or prior knowledge of the urban design features. In this study, we propose deep hybrid models (DHM), which directly combine road networks and sociodemographic features as inputs for travel mode share analysis. Using graph embedding (GE) techniques, we enhance travel demand models with a more powerful representation of urban structures. In mode share prediction experiments in Chicago, the results demonstrate that DHM can provide valuable spatial insights into the sociodemographic structure, improving the performance of travel demand models in estimating different mode shares at the city level. Specifically, DHM improves the results by more than 20% while retaining the interpretation power of the choice models, demonstrating its superiority in interpretability, prediction accuracy, and geographical insights.
- North America > United States > Florida > Alachua County > Gainesville (0.28)
- North America > United States > Illinois > Cook County > Chicago (0.25)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- (5 more...)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
Deep Learning-Based Weather-Related Power Outage Prediction with Socio-Economic and Power Infrastructure Data
Wang, Xuesong, Fatehi, Nina, Wang, Caisheng, Nazari, Masoud H.
This paper presents a deep learning-based approach for hourly power outage probability prediction within census tracts encompassing a utility company's service territory. Two distinct deep learning models, conditional Multi-Layer Perceptron (MLP) and unconditional MLP, were developed to forecast power outage probabilities, leveraging a rich array of input features gathered from publicly available sources including weather data, weather station locations, power infrastructure maps, socio-economic and demographic statistics, and power outage records. Given a one-hour-ahead weather forecast, the models predict the power outage probability for each census tract, taking into account both the weather prediction and the location's characteristics. The deep learning models employed different loss functions to optimize prediction performance. Our experimental results underscore the significance of socio-economic factors in enhancing the accuracy of power outage predictions at the census tract level.
- North America > United States > New York (0.05)
- North America > United States > Michigan > Wayne County > Detroit (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- Government > Regional Government > North America Government > United States Government (0.69)
- Energy > Power Industry > Utilities (0.66)